Introduction

Music is an essential part of many people’s lives, and with digital music streaming platforms like Spotify, discovering new music has become more accessible than ever. Spotify’s algorithm suggests playlists based on users’ listening habits and preferences. One of the most popular playlists on Spotify is “New Music Friday,” updated every week with new releases from various artists. It is curated by Spotify’s editorial team, who select the latest and most popular releases from various genres. However, the question remains: how accurately does this playlist align with personal musical taste and style?

This final portfolio explores the inner workings of Spotify’s music recommendation algorithm and attempts to build a model that predicts a user’s preferred musical genres and styles. The analysis focuses on two playlists, “Songs I Like” and “Songs I Dislike”, to identify patterns in personal musical taste. The “Songs I Like” playlist is a collection of songs that hold personal meaning and spans several genres, including pop, rock, musicals, and music from the 70s, 80s, and 90s. In contrast, the “Songs I Dislike” playlist consists of heavy metal and drill rap songs that do not match personal taste.

By analyzing the features of songs in both playlists, this portfolio aims to provide insights into personal musical preferences and to use this information to train a machine learning model that predicts preferred musical genres and styles. Additionally, the analysis will investigate whether this model can predict which songs from popular playlists like “New Music Friday” would resonate with the user, allowing for a more personalized and tailored listening experience.

In summary, this portfolio tries to offer an understanding of Spotify’s music recommendation algorithm and the potential for using machine learning to predict a user’s preferred musical genres and styles.

Corpus: Songs I like

Embedded Spotify playlist: https://open.spotify.com/embed/playlist/6pl0C7qbIl5uoY3Tdf82oa?utm_source=generator

Corpus: Songs I dislike

Embedded Spotify playlist: https://open.spotify.com/embed/playlist/4bJQX5w7W4wEnHLmWqUIVY?utm_source=generator&theme=0

Feature analysis

Relationship between Valence, Energy, Loudness, and Mode for the songs I like


Spotify’s track-level features, such as valence, danceability, energy, and loudness, offer valuable insights into the underlying patterns of my personal music preferences. I selected these features as they are commonly used to describe the overall mood, intensity, and emotional tone of a song, allowing us to better understand the characteristics that define the songs I like and dislike (Serra et al.). By examining these features, we can gain a deeper understanding of the traits that influence my musical taste.

Based on the first scatterplot, I tend to like songs that are loud, although the plot covers a range of loudness values. I also seem to be drawn to high-energy songs: most points cluster in the upper-right portion of the plot, where both energy and loudness are high.

As for valence, the songs I like mostly fall between 0.5 and 1.0, in the upper half of the scale. This suggests that I tend to prefer songs with a positive emotional valence, which could contribute to their appeal.

One interesting thing I noticed in the plot is that the songs I like are split roughly evenly between major and minor modes, as indicated by the color coding in the plot. This suggests that the mode of the songs I like doesn’t strongly influence my preference for them.

Relationship between Valence, Energy, Loudness, and Mode for the songs I dislike


Looking at the scatterplot of songs I don’t like, I can see that they’re usually not as loud as the songs I enjoy. Even though they still have high energy levels, they often have lower valence. This makes me believe that I might like songs that have a more positive and uplifting vibe, rather than ones that are more downbeat or sad.

It’s interesting to see that, just like the songs I like, the songs I don’t like are spread out pretty evenly between major and minor modes. This tells me that the mode of a song doesn’t really play a big role in whether I like it or not.

Another thing I noticed is that the songs I don’t like seem to have a smaller range of loudness and energy compared to the songs I do like. This could mean that I’m more open to different levels of intensity and dynamics when it comes to music I enjoy, while the music I don’t like might have more in common in terms of how loud and energetic they are.
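The comparison of ranges described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows in `TRACKS` are invented and are not taken from my playlists, and the helper name `summarise` is hypothetical. The sketch computes the mean and the range (maximum minus minimum) of each feature per playlist, which is one simple way to check whether one playlist is more tightly clustered than the other.

```python
from statistics import mean

# Hypothetical rows: (playlist, energy, loudness in dB, valence).
# These values are invented for illustration and are NOT the real playlist data.
TRACKS = [
    ("like", 0.80, -5.0, 0.70),
    ("like", 0.60, -9.0, 0.90),
    ("like", 0.90, -4.0, 0.55),
    ("dislike", 0.95, -7.0, 0.30),
    ("dislike", 0.97, -6.5, 0.20),
    ("dislike", 0.93, -7.5, 0.40),
]
FEATURES = ("energy", "loudness", "valence")


def summarise(tracks):
    """Return {playlist: {feature: (mean, range)}} for every playlist."""
    grouped = {}
    for playlist, *values in tracks:
        grouped.setdefault(playlist, []).append(values)
    summary = {}
    for playlist, rows in grouped.items():
        columns = list(zip(*rows))  # transpose rows into per-feature columns
        summary[playlist] = {
            name: (mean(col), max(col) - min(col))
            for name, col in zip(FEATURES, columns)
        }
    return summary


if __name__ == "__main__":
    for playlist, stats in summarise(TRACKS).items():
        for name, (avg, spread) in stats.items():
            print(f"{playlist:8s} {name:9s} mean={avg:6.2f} range={spread:5.2f}")
```

With real data, a smaller range for the disliked songs would support the observation that they are more alike in loudness and energy.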

Zooming in on Valence


This plot provides a visual representation of how valence is distributed among two different sets of songs: ‘Songs I like’ and ‘Songs I dislike’.

Valence is a measure of how positive or negative a musical piece sounds, on a scale from 0.0 (most negative) to 1.0 (most positive). The plot shows two histograms, one in light blue representing the valence distribution for the songs that I like, and the other in light coral representing the valence distribution for the songs that I dislike.

By looking at the plot, we can see that the valence distribution for the songs that I like is concentrated towards the positive end of the scale, indicating that the songs that I like tend to have a more positive sound. In contrast, the valence distribution for the songs that I dislike is concentrated towards the negative end of the scale, suggesting that the songs I dislike have a more negative sound.
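To make the idea of a valence histogram concrete, here is a minimal Python sketch of the binning step. It is not the code behind the plot: the function `histogram` and the two value lists are hypothetical, and the numbers are invented purely to show how two distributions on the 0.0 to 1.0 scale can be compared bin by bin.

```python
def histogram(values, bins=5, lo=0.0, hi=1.0):
    """Count values into `bins` equal-width bins covering [lo, hi]."""
    counts = [0] * bins
    for value in values:
        if not lo <= value <= hi:
            raise ValueError(f"value {value} lies outside [{lo}, {hi}]")
        # Scale to a bin index; the upper edge `hi` goes into the last bin.
        index = min(int((value - lo) * bins / (hi - lo)), bins - 1)
        counts[index] += 1
    return counts


# Hypothetical valence values, invented for illustration only.
LIKED = [0.52, 0.61, 0.67, 0.74, 0.78, 0.83, 0.91, 0.45]
DISLIKED = [0.12, 0.18, 0.25, 0.31, 0.33, 0.41, 0.47, 0.58]

if __name__ == "__main__":
    for name, values in (("Songs I like", LIKED), ("Songs I dislike", DISLIKED)):
        print(name)
        for i, count in enumerate(histogram(values)):
            print(f"  {i * 0.2:.1f}-{(i + 1) * 0.2:.1f} | {'#' * count}")
```

In this toy data the liked songs fill the upper bins and the disliked songs the lower ones, which is the pattern the plot is described as showing.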

Zooming in on Energy


This plot provides insight into the energy distribution of two different sets of songs: ‘Songs I like’ and ‘Songs I dislike’. Energy is a measure of how intense and active a musical piece is, on a scale from 0.0 (low) to 1.0 (high). The plot displays two histograms, one in light green representing the energy distribution for the songs that I like, and the other in coral representing the energy distribution for the songs that I dislike.

Upon examining the plot, we can see that the energy distribution for the songs that I like is concentrated between 0.7 and 0.9, with a peak around 0.8. This indicates that the songs that I enjoy tend to be fairly energetic without sitting at the very top of the scale. In contrast, the energy distribution for the songs that I dislike is concentrated around 0.95, indicating that the songs that I dislike tend to be even more energetic.

The findings of this plot suggest that energy may be an important factor in shaping my musical preferences. However, it is essential to note that energy is just one of many factors that can influence an individual’s musical tastes.

Zooming in on Loudness


Musical preferences are complex and can be influenced by various factors, including the loudness of a musical piece. Loudness is an essential characteristic of music that can affect how it is perceived by the listener. It refers to the volume or intensity of a musical piece and is typically measured in decibels.

The plot compares the loudness of the songs that I like with that of the songs that I dislike. It suggests that my preferred songs tend to be louder than the songs that I dislike, which is in line with the scatterplots, so loudness appears to be a relevant characteristic of my musical taste. However, it is crucial to remember that loudness is just one factor that can influence musical preferences.

Track-Level Summary

Overview

Outliers

The corpus chosen for this study is broad and contains diverse material related to the research question. However, the large volume of data made it challenging to identify patterns or trends that could adequately address that question.

To overcome this challenge, the analysis focuses on the outliers in the corpus: the tracks with the most extreme values of a specific timbre component. Isolating these extreme cases makes it easier to study the factors that contribute to their distinctive timbre.

Timbre is the quality of sound that distinguishes different musical instruments. In this study it is analyzed through spectral content, which measures the relative strengths of the frequency components that make up the sound.

To further investigate this approach, tables were generated to identify the maximum and minimum values of specific timbre components for each playlist. The tables for the maximum and minimum values of c01, c02, c03, and c04 for both the ‘Songs I like’ and ‘Songs I dislike’ playlists showed notable differences in timbre components between the two playlists. For example, the table of maximum and minimum values of c02 for each playlist showed that the highest and lowest timbre values for ‘Songs I like’ were exhibited by the songs ‘What I Like About You’, and ‘The Sailor’s Warning’, respectively, while the highest and lowest timbre values for ‘Songs I dislike’ were exhibited by the songs ‘Murder’ and ‘19 Tini 5’, respectively.

These tables provide insight into how specific timbre components differ between songs that I like and songs that I dislike, which helps to support the approach of focusing on the outliers in the corpus to gain valuable insights into timbre characteristics.
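The outlier selection can be sketched as follows. The four c02 values are the ones reported in the c02 table; the two filler rows and the function name `extremes` are hypothetical and only stand in for the rest of each playlist. This is a sketch of the idea, not the code that produced the tables.

```python
# (playlist, track, c02 value). The first four rows come from the c02 table;
# the two "hypothetical" rows are invented fillers standing in for other tracks.
ROWS = [
    ("Songs I dislike", "19 Tini 5", -81.27966),
    ("Songs I like", "The Sailor's Warning", -34.36908),
    ("Songs I dislike", "Murder (feat. Tom Skeemask & GK)", 145.18006),
    ("Songs I like", "What I Like About You", 109.29211),
    ("Songs I dislike", "(hypothetical track A)", 10.0),
    ("Songs I like", "(hypothetical track B)", 20.0),
]


def extremes(rows):
    """Return {playlist: {"min": (value, track), "max": (value, track)}}."""
    by_playlist = {}
    for playlist, track, value in rows:
        by_playlist.setdefault(playlist, []).append((value, track))
    # Tuples compare by value first, so min/max pick the extreme component value.
    return {
        playlist: {"min": min(items), "max": max(items)}
        for playlist, items in by_playlist.items()
    }


if __name__ == "__main__":
    for playlist, result in extremes(ROWS).items():
        for kind in ("min", "max"):
            value, track = result[kind]
            print(f"{playlist:16s} {kind}: {track} ({value:.5f})")
```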

Timbre components

c01

Table of maximum and minimum values of c01 for each playlist

| Statistic | Playlist | Track | Artists | Value |
|---|---|---|---|---|
| Minimum | Songs I dislike | Street Sense | Shawty Pimp, MC Spade | 37.45836 |
| Minimum | Songs I like | Africa | TOTO | 36.25328 |
| Maximum | Songs I dislike | 1984 | Slaughter to Prevail | 55.58922 |
| Maximum | Songs I like | Starstruck | Years & Years | 54.76817 |

c02

Table of maximum and minimum values of c02 for each playlist

| Statistic | Playlist | Track | Artists | Value |
|---|---|---|---|---|
| Minimum | Songs I dislike | 19 Tini 5 | TiniMaine | -81.27966 |
| Minimum | Songs I like | The Sailor’s Warning | Faela | -34.36908 |
| Maximum | Songs I dislike | Murder (feat. Tom Skeemask & GK) | DJ Squeeky, Tom Skeemask, GK | 145.18006 |
| Maximum | Songs I like | What I Like About You | The Romantics | 109.29211 |

c03

Table of maximum and minimum values of c03 for each playlist

| Statistic | Playlist | Track | Artists | Value |
|---|---|---|---|---|
| Minimum | Songs I dislike | Street Sense | Shawty Pimp, MC Spade | -109.32972 |
| Minimum | Songs I like | Think About Things | Daði Freyr | -44.19746 |
| Maximum | Songs I dislike | Wounds | ColdWorld | 59.98358 |
| Maximum | Songs I like | golden hour | JVKE | 69.93170 |

c04

Table of maximum and minimum values of c04 for each playlist

| Statistic | Playlist | Track | Artists | Value |
|---|---|---|---|---|
| Minimum | Songs I dislike | 2 Thick | DJ Zirk, Tha 2thick Family, Tom Skee, BuckShotz | -40.84057 |
| Minimum | Songs I like | The Sailor’s Warning | Faela | -25.60552 |
| Maximum | Songs I dislike | LIVING LEGEND | Scarlxrd | 30.74823 |
| Maximum | Songs I like | The Way You Make Me Feel - 2012 Remaster | Michael Jackson | 29.41128 |

Chromagrams

A chromagram is a visual representation of how energy is distributed over the twelve pitch classes across a musical recording. Comparing the chromagrams of two songs, “State of Unrest” from the playlist of disliked songs and “Shut Up and Dance” from the playlist of liked songs, reveals interesting differences in their pitch distribution. In “State of Unrest,” there is a strong concentration of energy around the pitch class D, indicating a relatively stable harmonic structure. On the other hand, “Shut Up and Dance” shows more variation, with energy spread over a wider range of pitch classes around F#, G#, and A#. This suggests that “Shut Up and Dance” has a more complex harmonic structure with more varied chord progressions. These differences in pitch distribution could contribute to the overall appeal of the songs and could be further explored in future analyses.
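As a rough illustration of what one column of a chromagram contains, the sketch below folds a handful of frequencies into the twelve pitch classes. It assumes equal temperament with A4 tuned to 440 Hz; the function names and the example partials are hypothetical, and normalising the vector so that it sums to 1 is one possible choice among several, not necessarily what the plots in this portfolio use.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(frequency_hz, a4_hz=440.0):
    """Map a frequency to its nearest equal-tempered pitch class name."""
    midi = round(69 + 12 * math.log2(frequency_hz / a4_hz))  # MIDI note number
    return PITCH_CLASSES[midi % 12]


def chroma_vector(partials):
    """Fold (frequency, magnitude) pairs into a 12-bin vector that sums to 1."""
    bins = dict.fromkeys(PITCH_CLASSES, 0.0)
    for frequency, magnitude in partials:
        bins[pitch_class(frequency)] += magnitude
    total = sum(bins.values())
    return {name: (value / total if total else 0.0) for name, value in bins.items()}


if __name__ == "__main__":
    # Hypothetical partials: D3, D4 and A4, with most of the energy on D.
    frame = [(146.83, 1.0), (293.66, 2.0), (440.0, 1.0)]
    for name, weight in chroma_vector(frame).items():
        print(f"{name:2s} {'#' * round(weight * 20)}")
```

A chromagram stacks one such vector per time frame, so a song that stays on D produces a single bright row, while a song that moves between several pitch classes lights up several rows.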

Cepstrograms

Overview

A cepstrogram is a visual representation of how timbral characteristics are distributed over the course of a musical recording. Comparing the cepstrograms of two songs, “State of Unrest” from the playlist of disliked songs and “Shut Up and Dance” from the playlist of liked songs, reveals interesting differences in their timbral distribution. In “State of Unrest,” there is a strong concentration of timbral characteristics around a certain range, indicating a relatively stable sonic texture. On the other hand, “Shut Up and Dance” shows more variation in timbral distribution, with a wider range of timbral characteristics. This suggests that “Shut Up and Dance” has a more complex and varied sound texture. These differences in timbral distribution could contribute to the overall appeal of the songs and could be further explored in future analyses.

Self-similarity Matrices

Overview

A self-similarity matrix is a visual representation of the similarity between different sections of a musical recording. Comparing four self-similarity matrices, one for timbre and one for chroma for each of the two songs “State of Unrest” and “Shut Up and Dance”, reveals interesting differences in their internal structure.

In the self-similarity matrix for the timbral characteristics of “State of Unrest,” there are clearly defined blocks, indicating sections within which the sound stays relatively constant. In contrast, the timbral self-similarity matrix for “Shut Up and Dance” shows a more continuous, fluid structure, suggesting a more unpredictable and dynamic sound.

Similarly, the chroma self-similarity matrix for “State of Unrest” shows a highly repetitive structure with clear diagonal lines, indicating the presence of repeated chord progressions. On the other hand, the chroma self-similarity matrix for “Shut Up and Dance” shows a more dispersed and varied pattern, indicating a more diverse harmonic structure.

These differences in internal structure could contribute to the overall appeal of the songs and provide insights into the musical composition and arrangement. Further analyses could explore the relationship between these structural characteristics and the emotional and perceptual responses to the music.
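The construction of a self-similarity matrix can be sketched in plain Python. The sketch assumes that each time frame is described by a feature vector (chroma or timbre) and that frames are compared with cosine distance; both the distance measure and the tiny `FRAMES` example are assumptions made for illustration, and the feature values are invented.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 0 means the two frames point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def self_similarity(frames):
    """Compare every frame with every other frame (distance matrix)."""
    return [[cosine_distance(a, b) for b in frames] for a in frames]


# Hypothetical feature vectors for two kinds of section, A and B.
SECTION_A = [0.9, 0.1, 0.0]
SECTION_B = [0.1, 0.8, 0.3]
# A toy song with the form A A B B A A.
FRAMES = [SECTION_A, SECTION_A, SECTION_B, SECTION_B, SECTION_A, SECTION_A]

if __name__ == "__main__":
    for row in self_similarity(FRAMES):
        # Low distance is printed as '#', high distance as '.'.
        print(" ".join("#" if d < 0.5 else "." for d in row))
```

In the printed grid the repeated A sections show up as matching blocks away from the main diagonal, which is the kind of structure the matrices above are read for.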

Chordograms

Overview

WIP

Tempograms

Overview

For the tempograms, I selected the two songs from the ‘Songs I like’ playlist with the lowest and highest c01 components, based on the Track-Level Summary. According to the c01 table there, ‘Africa’ by Toto has the lowest c01 value and ‘Starstruck’ by Years & Years has the highest. This selection was made to investigate potential differences in tempo between the songs.

From the plots, it can be observed that both songs have a relatively steady tempo with occasional variations. However, ‘Starstruck’ exhibits more frequent tempo changes, which may be related to its higher c01 timbre component. The c01 component roughly corresponds to the overall loudness of the song.

In conclusion, selecting songs by their c01 values allowed for an exploration of potential differences in tempo patterns between tracks that differ in overall loudness. Additionally, both songs keep a largely steady tempo, which fits my tendency to prefer songs with a consistent tempo.
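A full tempogram is more involved than what fits here, so the sketch below uses a simpler stand-in: it converts a list of beat timestamps into local tempo values in beats per minute and measures how much they vary. The function name and both lists of beat times are hypothetical and invented for illustration; they are not measurements of ‘Africa’ or ‘Starstruck’.

```python
from statistics import mean, pstdev


def local_tempi(beat_times):
    """Convert consecutive beat timestamps (seconds) into local tempi in BPM."""
    return [60.0 / (later - earlier)
            for earlier, later in zip(beat_times, beat_times[1:])]


# Hypothetical beat timestamps in seconds, invented for illustration.
STEADY = [0.0, 0.5, 1.0, 1.5, 2.0]    # a beat every half second
VARYING = [0.0, 0.5, 1.1, 1.6, 2.3]   # uneven spacing between beats

if __name__ == "__main__":
    for name, beats in (("steady", STEADY), ("varying", VARYING)):
        tempi = local_tempi(beats)
        print(f"{name:8s} mean={mean(tempi):6.1f} BPM  spread={pstdev(tempi):5.1f} BPM")
```

A song with a steady tempo yields a spread close to zero, while frequent tempo changes show up as a larger spread.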

Trained Model

Overview

WIP

Conclusion

Based on the insights gained from the feature analysis, it appears that there are clear patterns in the musical features that I find appealing and unappealing in songs. This raises the possibility of developing a classification model to predict whether or not I would like a particular song based on its acoustic features.

For example, a decision tree model could be trained on a dataset of songs that I have rated as either liked or disliked, using features such as loudness, energy, valence, and mode as predictors. The resulting model could then be used to predict the likelihood of me liking a new song based on its acoustic features.
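To give an impression of how such a model picks its rules, here is a minimal sketch of a decision stump, that is, a decision tree with a single split chosen by Gini impurity. It is not the trained model of this portfolio: the training rows are invented, the feature set leaves out mode for brevity, and the names `gini`, `best_split`, and `predict` are hypothetical. A real model would use a full decision tree implementation and far more data.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (1 = liked, 0 = disliked)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, feature_names):
    """Return (feature, threshold, weighted impurity) of the best single split."""
    best = None
    for name in feature_names:
        values = sorted({features[name] for features, _ in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2  # midpoint between neighbouring values
            left = [label for f, label in rows if f[name] <= threshold]
            right = [label for f, label in rows if f[name] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[2]:
                best = (name, threshold, score)
    return best


def predict(rows, split, features):
    """Predict 1 (like) or 0 (dislike) by majority vote on the matching side."""
    name, threshold, _ = split
    goes_left = features[name] <= threshold
    side = [label for f, label in rows if (f[name] <= threshold) == goes_left]
    return int(2 * sum(side) >= len(side))


# Hypothetical training data, invented for illustration only.
ROWS = [
    ({"valence": 0.70, "energy": 0.80, "loudness": -5.0}, 1),
    ({"valence": 0.90, "energy": 0.60, "loudness": -9.0}, 1),
    ({"valence": 0.55, "energy": 0.90, "loudness": -4.0}, 1),
    ({"valence": 0.30, "energy": 0.95, "loudness": -7.0}, 0),
    ({"valence": 0.20, "energy": 0.97, "loudness": -6.5}, 0),
    ({"valence": 0.40, "energy": 0.93, "loudness": -7.5}, 0),
]

if __name__ == "__main__":
    split = best_split(ROWS, ("valence", "energy", "loudness"))
    print("best split:", split)
    new_song = {"valence": 0.80, "energy": 0.75, "loudness": -6.0}
    print("prediction for new song:", predict(ROWS, split, new_song))
```

On this toy data the stump splits on valence, which mirrors the pattern seen in the histograms, but that outcome depends entirely on the invented rows.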

While the plots provide some valuable insights into the musical features that I find appealing or unappealing, it’s important to note that these are just a few of the many features that could potentially influence my musical taste. A more accurate classification model would need to take into account a broader range of features, such as tempo, rhythm, instrumentation, and genre, among others. Additionally, the model would need to be trained on a larger and more diverse set of songs to ensure that it can accurately classify songs that I like or dislike across a wider range of styles and genres. Nevertheless, the insights gained from these plots provide a good starting point for developing a more comprehensive model of my musical taste.